This invited review discusses causal learning in the context of robotic intelligence. The paper first introduces psychological findings on causal learning in human cognition, then the traditional statistical solutions for causal discovery and causal inference. It reviews recent deep causal learning algorithms, with a focus on their architectures and the benefits of using deep nets, and discusses the gap between deep causal learning and the needs of robotic intelligence.
translated by 谷歌翻译
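Videos are vulnerable to tampering attacks that alter their meaning and deceive viewers. Previous video forgery detection schemes look for tiny clues to locate tampered regions. However, attackers can successfully evade detection by destroying such clues with video compression or blurring. This paper proposes a video watermarking network for tamper localization. We jointly train a 3D-UNet-based watermark embedding network and a decoder that predicts the tampering mask. The perturbation introduced by watermark embedding is almost imperceptible. Since no off-the-shelf differentiable video codec simulator is available, we propose to mimic video compression by combining the simulation results of other typical attacks, such as JPEG compression and blurring, as an approximation. Experimental results show that our method generates watermarked videos with good imperceptibility and locates tampered regions robustly and accurately in attacked versions.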
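Rank aggregation aims to combine the preference rankings of a number of alternatives from different voters into a single consensus ranking. Although it is a useful model for a variety of practical applications, it is a computationally challenging problem. In this paper, we propose an effective hybrid evolutionary ranking algorithm to solve the rank aggregation problem with both complete and partial rankings. The algorithm features a semantic crossover based on concordant pairs and a late acceptance local search reinforced by an efficient incremental evaluation technique. Experiments conducted to assess the algorithm indicate highly competitive performance on benchmark instances compared with state-of-the-art algorithms. To demonstrate its practical usefulness, the algorithm is applied to label ranking, an important machine learning task.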
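As a rough illustration of what a consensus ranking is scored against, the sketch below assumes the common Kemeny-style objective: the total Kendall tau distance (number of discordant pairs) between a candidate ranking and each voter's ranking. The abstract does not state which objective the algorithm uses, so this is an assumption, and the function names `kendall_tau_distance` and `consensus_cost` are hypothetical. Only complete rankings over the same items are handled.

```python
from itertools import combinations


def kendall_tau_distance(r1, r2):
    """Count item pairs that the two complete rankings order differently."""
    pos1 = {item: i for i, item in enumerate(r1)}
    pos2 = {item: i for i, item in enumerate(r2)}
    return sum(
        1
        for a, b in combinations(r1, 2)
        # A pair is discordant when its relative order differs between rankings.
        if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
    )


def consensus_cost(candidate, rankings):
    """Assumed objective: sum of distances from the candidate to every voter."""
    return sum(kendall_tau_distance(candidate, r) for r in rankings)


voters = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
cost = consensus_cost(["a", "b", "c"], voters)
```

A search procedure such as the hybrid evolutionary algorithm described above would try to find the candidate with the lowest such cost; the pairs that are not discordant are the concordant pairs the crossover refers to.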
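Abstract. Purpose: This paper proposes a scheme for generating virtual intraoperative CT scans to improve surgical completeness in endoscopic sinus surgery (ESS). Methods: The work presents three methods for virtual intraoperative CT generation (tip motion-based, tip trajectory-based, and instrument-based), along with non-parametric smoothing and Gaussian process regression. Results: The proposed methods were studied and compared on ESS performed on cadavers. Surgical results show that all three methods improve the Dice similarity coefficient (> 86%), with F-score > 92% and precision > 89.91%. The tip trajectory-based method was found to have the best performance, achieving 96.87% precision in surgical completeness evaluation. Conclusions: This work demonstrates that virtual intraoperative CT scans improve the consistency between the actual surgical scene and the reference model, and improve surgical completeness in ESS. Compared with actual intraoperative CT scans, the scheme has no impact on existing surgical protocols, requires no hardware beyond what is already available in most ESS, avoids the high cost, repeated radiation, and prolonged anesthesia caused by actual intraoperative CTs, and is practical in ESS.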
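Endoscopic sinus and skull base surgery (ESSBS) is a challenging and potentially dangerous surgical procedure, and objective skill assessment is a key component in improving the effectiveness of surgical training, re-validating surgeons' skills, and decreasing surgical trauma and the complication rate in operating rooms. Because of the complexity of surgical procedures, the variation in operation styles, and the fast development of new surgical skills, surgical skill assessment remains a challenging problem. This work presents a novel Gaussian process learning-based heuristic method for automatic objective surgical skill assessment. Unlike classical surgical skill assessment algorithms, the proposed method 1) utilizes kinematic features of the relative movements of surgical instruments, rather than specific surgical tasks or statistics, to assess skill in real time; 2) provides informative feedback rather than a summary score; 3) is able to learn incrementally from new data rather than depending on a fixed dataset. The method projects instrument movements into endoscope coordinates to reduce the data dimensionality. It then extracts kinematic features of the projected data and learns the relationship between surgical skill levels and the features with the Gaussian process learning technique. The method was verified in full endoscopic skull base and sinus surgeries on cadavers. These surgeries have different pathologies, require different treatments, and have different complexities. Experimental results show that the method reaches 100% prediction accuracy for complete surgical procedures and 90% accuracy for real-time prediction assessment.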
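Spoken language understanding (SLU) has been addressed as a supervised learning problem, where a set of training data is available for each domain. However, annotating data for each domain is both financially costly and non-scalable, so we should fully utilize information across all domains. One existing approach solves the problem by conducting multi-domain learning, using shared parameters for joint training across domains. We propose to improve the parameterization of this method by using domain-specific and task-specific model parameters to improve knowledge learning and transfer. Experiments on 5 domains show that our model is more effective for multi-domain SLU and obtains the best results. In addition, its transferability is shown by outperforming the prior best model by 12.4% when adapting to a new domain with little data.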
Deep convolutional neural networks (CNNs) have been widely used for medical image segmentation. In most studies, only the output layer is exploited to compute the final segmentation results, and the hidden representations of the deep learned features are not well understood. In this paper, we propose a prototype segmentation (ProtoSeg) method to compute a binary segmentation map based on deep features. We measure the segmentation ability of the features by computing the Dice between the feature segmentation map and the ground truth, which we call the segmentation ability score (SA score for short). The SA score quantifies the segmentation ability of deep features in different layers and units, which helps in understanding deep neural networks for segmentation. In addition, our method can provide a mean SA score that gives a performance estimate of the output on test images without ground truth. Finally, we use the proposed ProtoSeg method to compute the segmentation map directly on input images to further understand the segmentation ability of each input image. Results are presented on segmenting tumors in brain MRI, lesions in skin images, COVID-related abnormalities in CT images, the prostate in abdominal MRI, and pancreatic masses in CT images. Our method can provide new insights for interpretable and explainable AI systems for medical image segmentation. Our code is available at https://github.com/shengfly/ProtoSeg.
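The Dice overlap that underlies the SA score can be sketched as follows. This is a minimal illustration on flattened binary masks, not the ProtoSeg implementation: how the feature segmentation map is obtained from deep features is not covered here, the function name `dice_score` is hypothetical, and returning 1.0 when both masks are empty is an assumed convention.

```python
def dice_score(pred, truth):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground pixels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


feature_mask = [1, 1, 0, 0]   # stand-in for a feature segmentation map
ground_truth = [1, 0, 0, 0]
score = dice_score(feature_mask, ground_truth)
```

Computing this score per layer or per unit, then comparing the values, is one way to read the layer-wise comparison described in the abstract.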
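The development of Internet technology has continuously intensified the spread and destructive power of rumors and fake news. Previous research on multimedia fake news detection includes a series of complex feature extraction and fusion networks to achieve feature alignment between images and text. However, what multimodal features are composed of, and how features from different modalities affect the decision-making process, remain open questions. We present Aura, a multimodal fake news detection network with adaptive unimodal representation aggregation. We first extract representations separately from image pattern, image semantics, and text, and generate multimodal representations by sending the semantic and linguistic representations to an expert network. Then, based on the unimodal and multimodal representations, we perform coarse-level fake news detection and cross-modal consistency learning. The classification and consistency scores are mapped to modality-aware attention scores that readjust the features. Finally, we aggregate and classify the weighted features for refined fake news detection. Comprehensive experiments on Weibo and Gossipcop demonstrate that Aura can successfully beat several state-of-the-art fake news detection (FND) schemes, with steady improvements in overall prediction accuracy and in the recall of fake news.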
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, yet its performance is impressive. CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains robust even if the LiDAR input is missing. Code will be released at https://github.com/junjie18/CMT.
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
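The idea of adding a trigger to raw data, as NAIVEATTACK is described as doing before distillation, can be sketched on a toy image. This is only an illustration under stated assumptions: the abstract does not specify the trigger's shape, size, position, or value, so the bottom-right square patch, the parameters `patch_size` and `value`, and the function name `add_trigger` are all hypothetical. Label manipulation and the distillation step itself are not shown.

```python
def add_trigger(image, patch_size=2, value=1.0):
    """Return a copy of a 2D image (list of rows) with a square patch stamped
    in the bottom-right corner. Patch placement and value are assumptions."""
    stamped = [row[:] for row in image]  # copy rows so the input is unchanged
    height, width = len(stamped), len(stamped[0])
    for r in range(height - patch_size, height):
        for c in range(width - patch_size, width):
            stamped[r][c] = value
    return stamped


clean = [[0.0] * 4 for _ in range(4)]  # toy 4x4 grayscale image
poisoned = add_trigger(clean)
```

DOORPING, by contrast, is described as updating the trigger iteratively throughout distillation, which a fixed patch like this one does not capture.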